Automatic Gating of Flow Cytometry Data

Automated gating was able to match the performance of central manual analysis for all tested panels, exhibiting little to no bias and comparable variability. Standardized staining, data collection, and automated gating can increase power, reduce variability, and streamline analysis for immunophenotyping.

The analysis presented in this paper used the two top-performing gating algorithms, OpenCyto (v. 1.7.4) and flowDensity (v. 1.4.0), from a study run by the FlowCAP consortium to select the best-performing algorithms for this larger study.

"Standardizing Flow Cytometry Immunophenotyping Analysis from the Human ImmunoPhenotyping Consortium", Finak, Langweiler, Jaimes, et al. (2016)

OpenCyto

Finak, Frelinger, Jiang, et al. (2014)

We propose to use OpenCyto to perform systematic and reproducible gating of 28 immune cell subsets. Gating is standardized via a .csv file describing the algorithmic approach for each step of the gating hierarchy. Importantly, this methodology allows for unbiased gating of thousands of samples and produces interpretable, labelled populations.
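To illustrate how a .csv file can encode a gating hierarchy, here is a minimal Python sketch that parses a three-step template and reconstructs the full path of each population. The column names follow the openCyto gating-template layout as understood here. The rows, channel names, method names, and arguments are hypothetical placeholders, not our actual template. The sketch only reads the file; it does not perform any gating.

```python
import csv
import io

# Hypothetical three-step template. Column names follow the openCyto
# template layout as understood here; every value is illustrative only.
TEMPLATE_CSV = """alias,pop,parent,dims,gating_method,gating_args
lymph,+,root,"FSC-A,SSC-A",flowClust,K=2
singlets,+,lymph,"FSC-H,FSC-W",singletGate,
live,-,singlets,PE-A,mindensity,
"""


def load_template(text):
    """Parse template text into a list of dicts, one per gating step."""
    return list(csv.DictReader(io.StringIO(text)))


def gating_paths(rows):
    """Return the full hierarchy path of every population alias."""
    parent_of = {row["alias"]: row["parent"] for row in rows}
    paths = {}
    for alias in parent_of:
        parts, node = [], alias
        # Walk up through the parents until the root is reached.
        while node != "root":
            parts.append(node)
            node = parent_of[node]
        paths[alias] = "/" + "/".join(reversed(parts))
    return paths


rows = load_template(TEMPLATE_CSV)
paths = gating_paths(rows)
for row in rows:
    print(paths[row["alias"]], "<-", row["gating_method"], "on", row["dims"])
```

Because each row names its parent, the whole hierarchy can be reviewed, versioned, and re-applied to new samples without any manual step.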

OpenCyto gives the user many options to refine algorithmic parameters to improve the performance of each step in the gating hierarchy. We evaluated the performance of our OpenCyto template using internal data for 151 manually gated (FlowJo software) samples across 15 gates. The global correlation between manual and OpenCyto population counts was high (rho = 0.9846, p-value < 2e-16). Despite this high global concordance, certain subsets were less well correlated (e.g., activated CD4 counts, rho = 0.6222, p-value < 2e-16).
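The comparison of manual and automated counts can be sketched as follows, assuming the reported rho is Spearman's rank correlation (the text above does not name the method). The counts are made-up placeholders, and the function names are ours. The same function can be applied globally or to the samples of a single gate to find poorly matched subsets.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-sample counts for one gate (not our data).
manual_counts = [1200, 850, 430, 95, 60]
opencyto_counts = [1180, 900, 410, 120, 40]
print(round(spearman_rho(manual_counts, opencyto_counts), 4))
```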

While OpenCyto can automate the classification of known subsets by following a traditional gating hierarchy, it does not easily facilitate the discovery of novel populations.

Novel subsets

t-SNE is a visualization method; it is unclear whether it can be used directly for novel subset detection.

PhenoGraph

(Levine, Simonds, Bendall, et al., 2015)

PhenoGraph performs unsupervised clustering of high-dimensional single-cell data, allowing for the discovery of novel subtypes. We propose to use OpenCyto to first limit our search space (e.g., starting from live, single T cells) and then search for novel populations within the clean subset.
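As a rough illustration of the graph that PhenoGraph clusters, the sketch below builds a k-nearest-neighbour graph and weights each edge by the Jaccard overlap of the two cells' neighbour sets. It covers only this graph-construction stage. The community detection step (Louvain in the published method) is omitted, and the brute-force search here would not scale to real data. Function names and the toy points are ours.

```python
import math


def knn_sets(points, k):
    """Brute-force k nearest neighbours (Euclidean) of each point, excluding itself."""
    neighbours = []
    for i, p in enumerate(points):
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        neighbours.append({j for _, j in dists[:k]})
    return neighbours


def jaccard_graph(points, k):
    """Return {(i, j): weight}, where weight is the Jaccard overlap of neighbour sets."""
    nbrs = knn_sets(points, k)
    edges = {}
    for i, ni in enumerate(nbrs):
        for j in ni:
            weight = len(ni & nbrs[j]) / len(ni | nbrs[j])
            if weight > 0:
                edges[(min(i, j), max(i, j))] = weight
    return edges


# Toy data: two well-separated clumps of four "cells" in two markers.
cells = [(0, 0), (0, 1), (1, 0), (1, 1),
         (10, 10), (10, 11), (11, 10), (11, 11)]
graph = jaccard_graph(cells, k=3)
for (i, j), w in sorted(graph.items()):
    print(i, j, round(w, 2))
```

In this toy example every edge stays within one clump, which is the structure a community detection algorithm would then recover as two clusters.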

PhenoGraph produces numerically labelled populations that do not have an immediate interpretation. To interpret the PhenoGraph results, we will compute the MEM score (Diggins, Greenplate, Leelatian, et al., 2017) of each PhenoGraph cluster, allowing for comparisons between cases and controls.
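A per-marker MEM score can be sketched as below, following the published equation as understood here: the absolute difference in medians between a cluster and a reference population, plus the ratio of reference IQR to cluster IQR, minus one, with the sign taken from the direction of the median difference. This should be checked against Diggins et al. before use. The IQR floor, the omission of the final rescaling to a fixed range, and the example values are assumptions of this sketch.

```python
import statistics

IQR_FLOOR = 0.5  # assumed lower bound, avoids dividing by a near-zero IQR


def iqr(values):
    """Interquartile range, floored at IQR_FLOOR."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return max(q3 - q1, IQR_FLOOR)


def mem_score(test, ref):
    """Unscaled MEM score of one marker for a test cluster against a reference."""
    diff = statistics.median(test) - statistics.median(ref)
    score = abs(diff) + iqr(ref) / iqr(test) - 1
    # A marker that is lower in the cluster than in the reference scores negative.
    return -score if diff < 0 else score


# Hypothetical transformed intensities of one marker (not our data).
cluster_cells = [4.0, 4.2, 4.1, 3.9, 4.3]
other_cells = [0.5, 1.0, 3.0, 0.2, 2.5, 1.5]
print(mem_score(cluster_cells, other_cells))
```

Computing this score for every marker in every cluster gives each numeric PhenoGraph label a signature of enriched and depleted markers.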

unique(data_xk_all$V1)
Lymphocytes (SSC-A v FSC-A)=1
Single Cells (FSC-H v FSC-W)=2
Live cells (PE-)=3
Tcells (CD3+ CD19-)=6
cytotoxic Tcells-CD8+=7
central memory cytotoxic Tcells (CCR7+ , CD45RA-)=15
activated helper Tcells (CD4+ HLA-DR+)=17
effector memory cytotoxic Tcells (CD95+ CD28-)=18
central memory helper Tcells (CD95+, CD28+)=19
effector memory helper Tcells (CD95+, CD28-)=20
activated cytotoxic Tcells (CD8+ HLA-DR+)=21
EM2 cytotoxic Tcells (CD27+ CD28-)=22
EM4 cytotoxic Tcells (CD27- CD28+)=23
pE1 cytotoxic Tcells (CD27+ CD28+)=24
effector memory helper Tcells (CCR7- CD45RA-)=25
naive cytotoxic Tcells (CD95- CD28+)=26
naive helper Tcells (CCR7+ CD45RA+)=27
EM3 cytotoxic Tcells (CD27- CD28-)=28
pE cytotoxic Tcells (CD27- CD28-)=29
naive cytotoxic Tcells (CCR7+ , CD45RA+)=30
central memory cytotoxic Tcells (CD95+ CD28+)=31
EM1 cytotoxic Tcells (CD27+ CD28+)=32
pE2 cytotoxic Tcells (CD27+ , CD28-)=33
effector helper Tcells (CCR7- CD45RA+)=34
cytotoxic Tcells CD27- , CD28+=36
central memory helper Tcells (CCR7+ CD45RA-)=37
naive helper Tcells (CD95-, CD28+)=38

Citrus

Bruggner, Bodenmiller, Dill, et al. (2014)

May be a good choice for Aim 1:

Citrus (cluster identification, characterization, and regression) is a data-driven approach for the identification of stratifying subpopulations in multidimensional cytometry datasets.

Citrus was designed to detect stratifying cell populations between cases and controls.